feat(cli): add hyperframes auth OAuth (PKCE + loopback + refresh)#1084
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
2605d59 to
733ea2e
Compare
|
Self-review pass on the OAuth work — addressed 12 of 15 findings. Force-push above includes the fixes. Fixed (high severity):
Fixed (medium severity): Deferred (won't fix in this PR):
Tests: 89 unit tests, all green. Lint/format/typecheck/fallow clean. |
a72b6ac to
5c81d96
Compare
733ea2e to
fa489bf
Compare
5c81d96 to
da2c2ea
Compare
fa489bf to
057c373
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
PKCE implementation looks correct — S256, proper verifier length, timing-safe state comparison, loopback bound to 127.0.0.1 only. The refresh/revocation flows are well thought out, especially preserving the RT on no-rotation refresh. Good stuff.
Two things worth thinking about:
Expired auth code error messaging — In exchangeCodeForTokens, a non-2xx response throws ErrApi which surfaces as a generic "HeyGen API error (400)". If the authorization code expires during the 120s loopback window (or was already used), the user gets a cryptic message after waiting a while. refreshTokens already special-cases 400/401 into REFRESH_FAILED — doing something similar for code exchange would make the failure a lot less confusing.
Concurrent refresh race — Two simultaneous CLI invocations hitting 401 will both try to refresh. The second one will likely get invalid_grant if the server invalidates the old RT on first use. Totally acceptable for a CLI tool, but if token rotation is ever enabled server-side, the second process silently loses its session. A file advisory lock would prevent it, but that's probably overkill for now — just worth being aware of.
Minor observation: assertOAuthConfiguredOrExit can never actually fire in production since DEFAULT_CLIENT_ID is always non-empty. The guard only protects against someone explicitly setting HYPERFRAMES_OAUTH_CLIENT_ID="". Not a problem, just means the OAUTH_NOT_CONFIGURED error path is effectively dead code.
Looks good to me.
vanceingalls
left a comment
There was a problem hiding this comment.
Suggested status: Request changes.
Findings:
- Blocking/security: OAuth token-endpoint error bodies are surfaced through
safeText()without the credential scrubbing used byAuthClient(packages/cli/src/auth/oauth.ts:161,packages/cli/src/auth/oauth.ts:272,packages/cli/src/auth/oauth.ts:372). These requests carryrefresh_token, authorization code, andcode_verifierin the form body; server/proxy error pages can echo request data. Please share the scrubber or add equivalent redaction here, with a test proving refresh/login errors cannot print token-shaped secrets. - Blocking/correctness:
persistOAuth()preserves priorrefresh_token/scope/token_type whenever the current response omits them (packages/cli/src/auth/oauth.ts:340), but it is also used by fresh authorization-code login (packages/cli/src/auth/oauth.ts:136). A new interactive login that omitsrefresh_tokencan pair a new access token with the previous session refresh token, then silently switch back or fail on the next refresh. Preserve missing refresh tokens only for refresh-grant persistence; fresh login should overwrite the OAuth block apart from preserving the co-located API key. - Downstack: this PR inherits the API-key rollback issue from #1081, so it should stay blocked until that is fixed too.
da2c2ea to
106b12f
Compare
057c373 to
fca9875
Compare
|
Thanks both — addressed the feedback. Force-pushed (still signed/verified). Blocking (vanceingalls):
Non-blocking (miguel):
Downstack #1081's rollback issue is also fixed, so this should unblock. |
miguel-heygen
left a comment
There was a problem hiding this comment.
Token exchange error handling addressed — 400/401 on code exchange now surfaces an actionable message instead of a bare API error. Good.
Concurrent refresh race and assertOAuthConfiguredOrExit dead code are cosmetic at this point. Ship it.
vanceingalls
left a comment
There was a problem hiding this comment.
Suggested status: Request changes.
The earlier OAuth-specific blockers are fixed: token-endpoint bodies now go through shared scrubCredentials, and fresh auth-code login uses preserveMissing: false while refresh uses preserveMissing: true.
Blocking issue:
- packages/cli/src/auth/scrub.ts:13 -
HEADER_LINEonly redacts the first whitespace-delimited word after Authorization:, soAuthorization: Bearer at_secret_123is scrubbed toauthorization: <redacted> at_secret_123. BecauseAuthClientsends OAuth credentials asAuthorization: Bearer <access_token>, an upstream/proxy error echoing that header still leaks opaque access tokens. Please redact the full header value and cover this with a test that checks the token afterBeareris gone.
106b12f to
bb59411
Compare
fca9875 to
e0bc7cd
Compare
|
Fixed the header-redaction leak in the shared scrubber — thanks. Force-pushed (still signed/verified).
Redaction stops at the line break, so unrelated following lines survive. Added a |
miguel-heygen
left a comment
There was a problem hiding this comment.
Scrub module properly handles all credential-shaped substrings now — form-encoded fields, JSON fields, headers, JWTs. Good.
vanceingalls
left a comment
There was a problem hiding this comment.
Suggested status: Approve from code review; wait for Graphite mergeability to finish before merge.
Re-review of e0bc7cd: prior OAuth blockers are fixed. Token endpoint errors use the shared scrubber, fresh authorization-code login no longer inherits a stale refresh_token, refresh still preserves a non-rotated refresh_token, and the full Authorization: Bearer <token> scrub regression is covered.
Verification run locally with Node 22 in PATH:
bun run --filter @hyperframes/cli test -- src/auth/client.test.ts src/auth/scrub.test.ts src/auth/oauth.test.ts src/commands/auth/login.test.tsbunx oxfmt --checkon touched auth/OAuth filesbunx oxlinton touched auth/OAuth files
bb59411 to
9272528
Compare
e0bc7cd to
ed429e1
Compare
124bfb8 to
3383bfe
Compare
9272528 to
8a9291c
Compare
3383bfe to
83d84f6
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
PKCE implementation is spec-correct:
- Verifier: 64 random bytes → 86 base64url chars (RFC 7636 compliant)
- Challenge: SHA-256 + base64url, method S256
- State: 32 random bytes + timingSafeEqual comparison
- Token storage: fresh login overwrites OAuth block, refresh preserves missing refresh_token (per RFC 6749 §6)
- Credential scrubbing expanded to cover form-encoded, JSON-encoded, and JWT patterns
- 401 auto-retry: fires only for OAuth with refresh_token, does not retry the retry
80 tests covering PKCE math, loopback edge cases (state mismatch, IdP error, timeout, 404), token persistence, refresh semantics, scrubber, and 401 retry logic.
Minor nits: assertOAuthConfiguredOrExit + resolveClientId double-checks the same condition; reportIdentity re-reads credentials instead of using the just-returned tokens. Neither blocking.
vanceingalls
left a comment
There was a problem hiding this comment.
Review
OAuth + PKCE foundation looks solid. Module boundaries are clean (pkce / loopback / browser / oauth / scrub / store), and the security-sensitive bits are right: crypto.randomBytes + S256 PKCE, 32-byte state with timingSafeEqual comparison, 127.0.0.1 (not localhost) + ephemeral port, 0600/0700 file modes, code_verifier kept in memory only, no secret logging on the success path, redirect_uri byte-identical across both hops (RFC 6749 §4.1.3). The 80-test suite covers the state-mismatch / IdP-error / missing-code / timeout / non-callback-path matrix cleanly, the _test-utils.ts factor-out is nice, and the refresh / preserveMissing semantics are documented well enough that the next reader won't re-derive RFC 6749 §6.
Three importants + a handful of nits below.
blocker / important
important — auth login will hang on a real browser due to HTTP keep-alive (packages/cli/src/auth/loopback.ts:171-178)
server.close() only refuses new connections; it doesn't terminate existing keep-alive sockets. Browsers issuing the callback default to Connection: keep-alive and idle the TCP connection for minutes (Chrome ~5min). I reproduced this with a stub http.Agent({ keepAlive: true }):
calling server.close()...
server.close callback fired in 0ms
still alive after 4005ms — keep-alive blocks event loop
respond() emits only content-type + cache-control, no Connection: close. End result: after the user sees "✓ Signed in as …", the CLI process hangs for the duration of the browser's keep-alive idle timeout. The unchecked smoke-test box in the PR description would catch this.
Two-line fix in respond():
res.writeHead(status, {
"content-type": "text/html; charset=utf-8",
"cache-control": "no-store",
connection: "close",
})Or call server.closeAllConnections() (Node ≥18.2) from close() after server.close(…).
important — scrubCredentials misses sk_V2_… keys (packages/cli/src/auth/scrub.ts:12)
The HEYGEN_KEY pattern is hg_[A-Za-z0-9_-]{4,} only, but this codebase already documents that real keys come in sk_V2_…, hg_…, and partner formats (store.ts:241, login.ts:18, store.test.ts:149). The HEADER_LINE regex catches keys when they appear after Authorization:/x-api-key:, but a JSON or text body that echoes a key inline ({"echoed_key":"sk_V2_…"}, a stack trace fragment, etc.) would leak the full key. That's exactly the threat model the scrubber is written for. Add sk_[A-Za-z0-9_-]{4,} (or unified (hg|sk)_…) to the pattern set and add a scrub.test.ts case.
important — 401-retry uses a stale refresh_token for rotated-RT servers (packages/cli/src/auth/client.ts:128-133)
const refreshed = await this.tryRefresh(credential.refresh_token);
if (refreshed) {
const next: ResolvedCredential = { ...credential, access_token: refreshed };
return await this.fetchUser(url, next, false);
}tryRefresh (via refreshTokens → persistOAuth) writes the new refresh_token to disk, but the in-memory credential object reused for the retry still carries the OLD refresh_token. For an IdP that rotates refresh tokens, a subsequent retry on the same credential instance would attempt to refresh with a now-invalidated RT.
Today's only call site (status) doesn't retry twice on the same credential, so this is latent. But the comment on getCurrentUser says "future endpoints inherit it for free" — those future endpoints will eat this. Either change the hook signature to return the full OAuthTokens (or a fresh ResolvedCredential), or have the hook also be responsible for re-reading from store and returning a fresh credential. A test that asserts the post-refresh credential carries the rotated RT would also catch this on the next refactor.
nit
nit — MIN_EXPIRES_IN_SECONDS=30 is less than EXPIRY_SKEW_MS=60_000 (oauth.ts:51, resolver.ts:42)
The clamp's comment says it prevents an immediate-refresh loop, but with the resolver's 60s skew, any expires_in ≤ 90s lands an expires_at the resolver immediately tags as refreshable: true. Today there's no proactive-refresh path so it doesn't actually loop (the 401-retry only fires on a real 401), so this is cosmetic — but if a future change adds proactive refresh on refreshable, you'd hit the loop the clamp is supposed to prevent. Set MIN_EXPIRES_IN_SECONDS = (EXPIRY_SKEW_MS/1000) + 30 or document the dependency.
nit — persistOAuth is read-modify-write without a lock (oauth.ts:400-419)
Two concurrent hyperframes auth refresh or auth login --api-key + auth refresh processes will race: each reads the store, writes a merged result via temp+rename. Last-write-wins. The window is short and writeStore is atomic per file, but you could lose the new refresh_token of one of the two. Not a security issue (just a UX one — user re-runs login), and out of scope for this PR, but worth a tracking comment / future flock.
nit — server.address() as AddressInfo defensive-cast (loopback.ts:72)
server.address() can return string | AddressInfo | null per the type defs. After a successful listen() on an IP/port it'll be AddressInfo, so the cast holds in practice — but a defensive if (!address || typeof address === "string") throw … would surface a programmer error rather than a Cannot read property 'port' of null if listen semantics ever drift.
nit — revokeTokens swallows abort/timeout silently (oauth.ts:248)
The catch {} makes diagnosis of "logout completed but server still thinks I'm logged in" hard. Logout's a rare interactive command — even a c.dim() one-liner ("Could not contact revoke endpoint; local session cleared.") would be helpful. The !res.ok branch already does this; the network-error branch should match.
nit — refresh command exits 1 only on REFRESH_FAILED; other AuthErrors throw (commands/auth/refresh.ts:62-66)
API_ERROR (5xx) bubbles past the isAuthError branch and crashes with a stack trace via the default citty handler. Consider treating any isAuthError(err) as a friendly-exit case here.
nit — auth login no longer rolls back OAuth state on identity-check failure (commands/auth/login.ts:65-78)
The api-key path snapshots + rolls back on 401. The OAuth path persists tokens and then warns "transient" on any verify error including 401. A 401 from /v3/users/me on tokens we just minted is more likely "the IdP issued, the API doesn't yet recognize" — your current "don't roll back, this is transient" is defensible — but you might want a 401-specific branch that surfaces "Signed in but identity check returned 401; run auth status in a moment" rather than the generic warn.
what's right
- PKCE:
randomBytes(64)→ 86-char URL-safe verifier,base64url(SHA256(verifier))challenge, S256 method. - State: 256-bit entropy,
timingSafeEqualcomparison. - Loopback:
127.0.0.1notlocalhost, ephemeral port,/oauth/callbackexact-path match (other paths → 404 without leaking the callback), method allow-list (GET only),cache-control: no-store. - Error-page HTML escapes IdP-supplied
error/error_description(XSS). redirectUricaptured once and reused — no reconstruction fromreq.socket.localAddress.code_verifiernever logged;Opening browser to <host+path>strips the query string deliberately.- 0600 file / 0700 dir; atomic temp+rename writes.
parseTokenResponsevalidates header-safety on tokens before they touch a future request.persistOAuthdiscipline (preserveMissingon refresh, not on fresh login) is correctly reasoned out — landed me on the exact RFC 6749 §6 rotation scenario.- Logout revokes refresh_token first then access_token, AbortController with 5s, never throws.
Approve once the keep-alive close fix and the sk_ scrub gap are in.
— Vai
83d84f6 to
81aff68
Compare
|
Thanks @vanceingalls — all three blockers fixed. Force-pushed (signed/verified). 1. Keep-alive hang on
Verified live: after successful consent, the CLI process exits promptly instead of waiting on the browser's idle timeout. 2. 3. 401-retry used stale Note on the rebase: #1081 merged earlier today, so this branch now rebases onto Skipped the smaller nits this round (revokeTokens log, refresh command's API_ERROR handling, MIN_EXPIRES_IN_SECONDS coupling) to keep this PR focused on the blockers — happy to do them as follow-ups if you want. 107 tests green, lint/format/typecheck/fallow clean. |
vanceingalls
left a comment
There was a problem hiding this comment.
Re-review on 81aff683
All three prior blockers are addressed cleanly.
Keep-alive hang fixed (loopback.ts:86, loopback.ts:178-189). Belt-and-suspenders — Connection: close on the response header so the browser doesn't even try to keep-alive, plus server.closeAllConnections?.() before server.close() to terminate any sockets that did slip through. The inline comment captures the failure mode (Chrome ~5min idle).
sk_V2_… scrub gap closed (scrub.ts:16). HEYGEN_KEY is now \b(hg|sk)_[A-Za-z0-9_-]{4,} so both prefixes redact when echoed inline (not just behind a header anchor). scrub.test.ts:11-20 locks in the exact threat model (sk_V2_hgu_… in a free-text error body).
Stale-RT-on-retry fixed at the type level (client.ts:89, client.ts:140-144). The refresh hook contract changed from Promise<string> (just access_token) to Promise<OAuthTokens>, and the retry rebuilds the credential with the new refresh_token when one is returned. Compile-time enforcement means future callers can't reintroduce the bug. The JSDoc on onUnauthenticatedRefresh explains why ("any subsequent refresh on the same in-memory credential doesn't re-use a now-invalidated rotated RT"), and client.test.ts:197-228 covers the type-shape regression.
CI
All required checks green on 81aff683: Build, Test, Typecheck, Lint, Format, CLI smoke (required), Smoke: global install, Studio: load smoke, Test: runtime contract, Preview parity. Windows shards still pending but not required for this PR.
Carried-forward nits (non-blocking, won't gate)
These all carried over from the prior review and remain optional:
MIN_EXPIRES_IN_SECONDS=30is still less thanEXPIRY_SKEW_MS=60_000(oauth.ts:51). Cosmetic today; relevant if a proactive-refresh path is added.persistOAuthis still read-modify-write without an advisory lock (oauth.ts:400-419). Concurrentauth refresh+auth logincould clobber a fresh RT. Miguel called this out too — acceptable as future work.server.address() as AddressInfocast remains undefended (loopback.ts:72).revokeTokensstill swallows abort/network errors silently (oauth.ts:248).auth refreshexits 1 only onREFRESH_FAILED;API_ERRORstill bubbles to the default citty handler with a stack trace (commands/auth/refresh.ts:39-46).auth login'sreportIdentitystill warns "transient" on any verify error, including 401 from the just-minted token. A 401-specific branch would be a touch friendlier.
Approving. Ship it when Graphite mergeability and the optional Windows checks complete.
— Vai
Merge activity
|

What
Adds OAuth 2.0 + PKCE login as the default for
hyperframes auth login,plus refresh-token + 401 auto-retry +
auth refresh. Stacks on top ofPR #1081 (the API-key + shared store work).
hyperframes auth login(no flags) — opens the user's browser to/v1/oauth/authorize, captures the code on an ephemeral127.0.0.1:<port>/oauth/callback, exchanges it for tokens withPKCE S256, and persists.
--api-keyopts back into the legacylong-lived-key path from PR feat(cli): add hyperframes auth login --api-key, status, logout #1081.
hyperframes auth refresh— force-refresh the OAuth access tokenusing the stored refresh_token. Mostly useful for testing the path.
hyperframes auth logout— best-effort revokes viaPOST /v1/oauth/revoke(RFC 7009) before wiping local state.AuthClientnow refreshes-and-retries once on a 401 when thecaller wires
onUnauthenticatedRefresh.auth statuswires it.Internals added in
packages/cli/src/auth/:pkce.ts— RFC 7636 code_verifier + S256 code_challenge.loopback.ts— ephemeral 127.0.0.1 HTTP server; state validation,120s timeout, styled success/error page.
browser.ts— wrapsopenwith aBROWSER=none/HF_NO_BROWSER=1fallback that prints the URL.oauth.ts—startAuthorizationCodeFlow,refreshTokens,revokeTokens,requireOAuthConfigured,parseTokenResponse.Why
This is the foundation OAuth flow that lets free-tier users authenticate
without managing a long-lived key. Refresh + auto-retry means CLI
commands keep working past the access_token lifetime without bugging
the user.
The OAuth client_id (
q2A2QRSke2LrFTPJhoDbHtXh) is the one Jamescreated in the
oauth2_clienttable. Baked in as a build-time default;override via
HYPERFRAMES_OAUTH_CLIENT_IDfor dev/test.How
client_secret. Backend alreadyrequires PKCE (
movio/logic/oauth2.py:638).server.listen(0)) — the backendwildcards localhost ports for public clients
(
movio/model/oauth2.py:check_redirect_uri), so the registeredredirect URI's port is a placeholder.
prevent CSRF.
expires_intype (someservers return it as a string) but strict on
access_tokenpresence.AuthClient.fetchUserlayer, not thecommand layer — so future endpoints inherit it for free.
persistOAuthmerges into the existing store (preserves co-locatedapi_key).auth login(API-key path) does the symmetric thing.Test plan
vitest run src/auth/.method, distinct outputs each call.
404 non-callback paths all rejected; success path captures
code.refreshTokensposts correct body, persists, throwsREFRESH_FAILEDon 400/401 andAPI_ERRORon 5xx. Existingapi_key preserved on refresh.
NOT retry for api_key, returns 401 if refresh hook fails.
bunx oxlint/bunx oxfmt --check/bunx tscclean.bunx fallow audit --base origin/main --fail-on-issues— onlyinherited
help.ts:showUsagefinding (from main, not this PR).HEYGEN_API_URL=https://api.dev.heygen.com hyperframes auth loginthen
hyperframes auth statusthenhyperframes auth refresh.Out of scope